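State space models have been shown to be effective at modeling long-range dependencies, especially on sequence classification tasks. In this work we focus on autoregressive sequence modeling over English books, GitHub source code, and arXiv mathematics articles. Based on recent developments around the effectiveness of gated activation functions, we propose a new layer named Gated State Space (GSS) and show that it trains significantly faster than the diagonal version of S4 (i.e. DSS) on TPUs, is fairly competitive with Transformer-based baselines, and exhibits zero-shot generalization to longer inputs while being straightforward to implement. Finally, we show that leveraging self-attention to model local dependencies improves the performance of GSS even further.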
translated by 谷歌翻译
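We develop a modified online mirror descent framework that is suitable for building adaptive and parameter-free algorithms in unbounded domains. We leverage this technique to develop the first unconstrained online linear optimization algorithm achieving an optimal dynamic regret bound, and we further demonstrate that natural strategies based on Follow-the-Regularized-Leader are unable to achieve similar results. We also apply our mirror descent framework to build new parameter-free implicit updates, as well as a simplified and improved unconstrained scale-free algorithm.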
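Unconstrained online linear optimization (OLO) is a practical problem setting for studying the training of machine learning models. Existing works have proposed a number of potential-based algorithms, but in general the design of these potential functions relies heavily on guessing. To streamline this workflow, we present a framework that generates new potential functions by solving a partial differential equation (PDE). Specifically, when the losses are 1-Lipschitz, our framework produces a novel algorithm with the anytime regret bound $C\sqrt{T}+\|u\|\sqrt{2T}\left[\sqrt{\log(1+\|u\|/C)}+2\right]$, where $C$ is a user-specified constant and $u$ is any comparator that is unknown and unbounded a priori. Such a bound attains an optimal loss-regret trade-off without the impractical doubling trick. Moreover, a matching lower bound shows that the leading-order term, including the constant multiplier $\sqrt{2}$, is tight. To our knowledge, the proposed algorithm is the first to achieve such optimalities.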
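To get a feel for how the regret bound quoted in this abstract scales, the short sketch below simply evaluates the expression $C\sqrt{T}+\|u\|\sqrt{2T}[\sqrt{\log(1+\|u\|/C)}+2]$ numerically. It is not the algorithm from the paper; the function name `regret_bound` and its argument names are illustrative assumptions.

```python
import math


def regret_bound(T, u_norm, C):
    """Evaluate C*sqrt(T) + ||u||*sqrt(2T)*(sqrt(log(1 + ||u||/C)) + 2).

    T      : number of rounds (non-negative)
    u_norm : norm of the comparator u (non-negative)
    C      : user-specified positive constant
    The name and signature are hypothetical, chosen for illustration only.
    """
    if T < 0 or u_norm < 0 or C <= 0:
        raise ValueError("need T >= 0, u_norm >= 0 and C > 0")
    log_term = math.sqrt(math.log1p(u_norm / C))
    return C * math.sqrt(T) + u_norm * math.sqrt(2 * T) * (log_term + 2)


# Compare the bound for a few comparator norms at a fixed horizon.
for u_norm in (0.0, 1.0, 10.0):
    print(u_norm, regret_bound(T=10_000, u_norm=u_norm, C=1.0))
```

With $\|u\| = 0$ only the $C\sqrt{T}$ term remains, which is the "loss" side of the loss-regret trade-off the abstract refers to.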
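We consider the online linear optimization problem, where at every step the algorithm plays a point $x_t$ in the unit ball and suffers loss $\langle c_t, x_t \rangle$ for some cost vector $c_t$ that is then revealed to the algorithm. Recent work showed that if the algorithm receives a hint $h_t$ that has non-trivial correlation with $c_t$ before it plays $x_t$, then it can achieve a regret guarantee of $O(\log T)$, improving on the $\Theta(\sqrt{T})$ bound of the standard setting. In this work, we study whether an algorithm really requires a hint at every time step. Somewhat surprisingly, we show that $O(\log T)$ regret can be obtained with just $O(\sqrt{T})$ hints under a natural query model; in contrast, we also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret. We give two applications of our results: to optimistic regret bounds and to the problem of online learning with abstention.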
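We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails. We show that a combination of gradient clipping, momentum, and normalized gradient descent yields convergence to critical points with high probability at best-known rates for smooth losses when the gradients only have bounded $\mathfrak{p}$th moments for some $\mathfrak{p} \in (1,2]$. We then consider the case of second-order smooth losses, which to our knowledge has not been studied in this setting, and again obtain high-probability bounds for any $\mathfrak{p}$. Moreover, our results hold for arbitrary smooth norms, in contrast to the typical SGD analysis, which requires a Hilbert space norm. Further, we show that after a suitable "burn-in" period, the objective value decreases monotonically at every iteration until a critical point is identified, which provides intuition behind the popular practice of learning rate "warm-up" and also yields a last-iterate guarantee.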
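The three ingredients named in this abstract (gradient clipping, momentum, and a normalized update) can be combined in a few lines. The sketch below is one plausible arrangement under stated assumptions, not the paper's exact algorithm: the function names, the order of operations, and the constant hyperparameters `lr`, `beta`, and `tau` are all illustrative choices.

```python
import math
import random


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def clip(g, tau):
    """Rescale g so that its norm is at most tau."""
    n = norm(g)
    if n <= tau:
        return list(g)
    return [x * tau / n for x in g]


def clipped_normalized_momentum_sgd(grad_fn, x0, steps, lr=0.1, beta=0.9, tau=1.0):
    """Hypothetical sketch: clip each gradient estimate, average the clipped
    gradients with momentum, then take a step of fixed length lr along the
    normalized momentum direction."""
    x = list(x0)
    m = [0.0] * len(x)
    for _ in range(steps):
        g = clip(grad_fn(x), tau)
        m = [beta * mi + (1.0 - beta) * gi for mi, gi in zip(m, g)]
        m_norm = norm(m)
        if m_norm == 0.0:
            continue  # no direction available; skip the update
        x = [xi - lr * mi / m_norm for xi, mi in zip(x, m)]
    return x


def noisy_quadratic_grad(x):
    """Gradient of 0.5*||x||^2 plus symmetric Pareto noise, which has a finite
    mean but infinite variance (a simple heavy-tailed example)."""
    return [xi + random.choice((-1.0, 1.0)) * random.paretovariate(1.5) for xi in x]


random.seed(0)
x_final = clipped_normalized_momentum_sgd(noisy_quadratic_grad, [3.0, -4.0], steps=500)
print("final distance from the minimizer:", norm(x_final))
```

Because every update has length exactly `lr`, a single extreme gradient sample cannot move the iterate far, which is the intuition for why normalization and clipping help under heavy-tailed noise.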
Foveated imaging provides a better tradeoff between situational awareness (field of view) and resolution and is critical in long-wavelength infrared regimes because of the size, weight, power, and cost of thermal sensors. We demonstrate computational foveated imaging by exploiting the ability of a meta-optical frontend to discriminate between different polarization states and a computational backend to reconstruct the captured image/video. The frontend is a three-element optic: the first element which we call the "foveal" element is a metalens that focuses s-polarized light at a distance of $f_1$ without affecting the p-polarized light; the second element which we call the "perifoveal" element is another metalens that focuses p-polarized light at a distance of $f_2$ without affecting the s-polarized light. The third element is a freely rotating polarizer that dynamically changes the mixing ratios between the two polarization states. Both the foveal element (focal length = 150mm; diameter = 75mm), and the perifoveal element (focal length = 25mm; diameter = 25mm) were fabricated as polarization-sensitive, all-silicon, meta surfaces resulting in a large-aperture, 1:6 foveal expansion, thermal imaging capability. A computational backend then utilizes a deep image prior to separate the resultant multiplexed image or video into a foveated image consisting of a high-resolution center and a lower-resolution large field of view context. We build a first-of-its-kind prototype system and demonstrate 12 frames per second real-time, thermal, foveated image, and video capture in the wild.
Reflections on glossy objects contain valuable and hidden information about the surrounding environment. By converting these objects into cameras, we can unlock exciting applications, including imaging beyond the camera's field-of-view and from seemingly impossible vantage points, e.g. from reflections on the human eye. However, this task is challenging because reflections depend jointly on object geometry, material properties, the 3D environment, and the observer viewing direction. Our approach converts glossy objects with unknown geometry into radiance-field cameras to image the world from the object's perspective. Our key insight is to convert the object surface into a virtual sensor that captures cast reflections as a 2D projection of the 5D environment radiance field visible to the object. We show that recovering the environment radiance fields enables depth and radiance estimation from the object to its surroundings in addition to beyond field-of-view novel-view synthesis, i.e. rendering of novel views that are only directly-visible to the glossy object present in the scene, but not the observer. Moreover, using the radiance field we can image around occluders caused by close-by objects in the scene. Our method is trained end-to-end on multi-view images of the object and jointly estimates object geometry, diffuse radiance, and the 5D environment radiance field.
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.
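This study proposes a novel approach to trend detection and visualization, more specifically, modeling the change in a topic over time. Whereas current models used for identifying and visualizing trends only convey the popularity of a single word based on stochastic counting of usage, the approach in this study illustrates both the popularity of a topic and the direction in which it is moving. The direction in this case is a distinct subtopic within the selected corpus. Such trends are generated by modeling the movement of a topic, using k-means clustering and cosine similarity to group the distances between clusters over time. In a convergent scenario, it can be inferred that the topics as a whole are meshing (tokens between topics become interchangeable). In contrast, a divergent scenario implies that each topic's respective tokens would not be found in the same context (the words are increasingly different from each other). The methodology was tested on a group of articles from various media houses present in the 20 Newsgroups dataset.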
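The convergent and divergent scenarios described in this abstract can be illustrated with a small sketch. It assumes that a clustering step (for example k-means) has already produced one centroid vector per topic per time slice, and it only shows the cosine-similarity comparison; the function names and the simple first-versus-last rule for labeling a trend are assumptions made for illustration, not the study's procedure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def similarity_series(centroids_a, centroids_b):
    """Similarity between two topics' centroids at each time slice."""
    return [cosine_similarity(a, b) for a, b in zip(centroids_a, centroids_b)]


def trend_label(series, tol=1e-9):
    """Hypothetical rule: compare the last similarity with the first."""
    change = series[-1] - series[0]
    if change > tol:
        return "converging"
    if change < -tol:
        return "diverging"
    return "stable"


# Toy centroids for two topics over three time slices (made-up numbers).
topic_a = [[1.0, 0.0], [1.0, 0.5], [1.0, 1.0]]
topic_b = [[0.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
series = similarity_series(topic_a, topic_b)
print(series, trend_label(series))
```

A rising similarity series corresponds to topics whose vocabularies are becoming interchangeable; a falling one corresponds to topics drifting apart.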
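Motion planning for a multi-limbed climbing robot must consider the robot's posture, joint torques, and how it uses contact forces to interact with its environment. This paper focuses on motion planning for a robot that uses nontraditional locomotion to explore unpredictable environments such as Martian caves. Our robotic concept, ReachBot, uses extendable and retractable booms as limbs to achieve a large reachable workspace while climbing. Each extendable boom is capped by a microspine gripper designed for grasping rocky surfaces. ReachBot leverages its large workspace to navigate around obstacles, over crevasses, and through challenging terrain. Our planning approach must be versatile, to accommodate variable terrain features, and robust, to mitigate risks from the stochastic nature of grasping with spines. In this paper, we introduce a graph traversal algorithm to select a discrete sequence of grasps based on available terrain features suitable for grasping. This discrete plan is complemented by a decoupled motion planner that considers the alternating phases of body movement and end-effector movement, using a combination of sampling-based planning and sequential convex programming to optimize individual phases. We use our motion planner to plan a trajectory across a simulated 2D cave environment with at least 95% probability of success and demonstrate improved robustness over a baseline trajectory. Finally, we verify our motion planning algorithm through experimentation on a 2D planar prototype.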
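One generic way to select a discrete sequence of grasps with a high overall success probability is a shortest-path search over a graph of candidate grasp sites. The sketch below is an assumption-laden illustration, not the paper's algorithm: it assumes each transition between grasp sites has an independent success probability, and it runs Dijkstra's algorithm on the costs $-\log p$ so that the cheapest path is the one with the largest product of probabilities. The graph, node names, and function name are hypothetical.

```python
import heapq
import math


def most_reliable_path(edges, start, goal):
    """Return (path, success_probability) maximizing the product of edge
    probabilities from start to goal, or (None, 0.0) if goal is unreachable.

    edges: dict mapping node -> list of (neighbor, probability), with each
    probability in (0, 1]. Hypothetical data layout for illustration.
    """
    best = {start: 0.0}  # lowest accumulated -log(probability) found so far
    parent = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, p in edges.get(node, []):
            new_cost = cost - math.log(p)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if goal not in best:
        return None, 0.0
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, math.exp(-best[goal])


# Toy graph of grasp sites with made-up transition success probabilities.
grasp_graph = {
    "A": [("B", 0.9), ("C", 0.99)],
    "B": [("D", 0.9)],
    "C": [("D", 0.98)],
}
path, prob = most_reliable_path(grasp_graph, "A", "D")
print(path, prob)
```

A planner could then accept the returned sequence only if its probability clears a chosen threshold, such as the 95% figure mentioned in the abstract.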